This is a hands-on session will show journalism students how to understand data and find stories using FBI crime data.
Learning Outcome:
—Understanding data
—Basic flaws with data sets
-Where to get data
-Storytelling data
—Data reporting now vs. before
Many government and business websites have data available for you to study. This is the federal government’s main open data portal. This will keep you entertained for a few days, to say the least
Many local governments have their own data portals which can lead to great stories.
Today, I will walk you through the FBI crime data for Maryland
The first thing to understand is the definitions of the columns and variables.
– A Word About UCR Data
– UCR Statistics: Their Proper Use (pdf)
Look for a data dictionary or a data schema
This data series lags significantly. We are working with 2018 data, the most recent available.
Other issues: Voluntary Guidelines
Not All Locales Report
Not all Locales Are the Same Data users should not rank locales because there are many factors that cause the nature and type of crime to vary from place to place
Rankings ignore the uniqueness of each locale
Since crime is a sociological phenomenon influenced by a variety of factors, the FBI discourages ranking the agencies and using the data as a measurement of law enforcement effectiveness
Population density and degree of urbanization. Variations in composition of the population, particularly youth concentration. Stability of the population with respect to residents’ mobility, commuting patterns, and transient factors. Economic conditions, including median income, poverty level, and job availability.
But it is still the best you can get and you can’t beat the price.
Download this table in Excel and open it.
Table 8: Offenses Known to Law Enforcement by City, 2018
Table 8: Violent Crime totals: Murder, Rape, Robbery, Aggravated Assault
Property Crime totals: Burglary, Larceny-theft, Motor vehicle theft, Arson.
You may see a notation (1) in Rape. In 2013, the term “forcible” was removed from the offense name, and the definition was changed FBI discontinuing reporting of rape data using the UCR legacy definition beginning in 2017.
Details are in Data Declaration: “CNTL” + click for a New Tab
Think of the basic news values when looking at this spreadsheet. Something big is news. Something unusual is news. Something person is news. So the murder rate in your community is an important fact.
>
> The answer for the place with the most violent crimes is obvious, but it is also the place with the most population. If we are to compare across cities – which has its own challenges – we want to construct a simple ratio.
>
This is not hard. A ratio is dividing one number into another. We will divide violent crimes into population to get a Violent Crime Per Capita ratio, or violent crimes per person. That number is a fraction, so we will arbitrarily multiply it by 1,000 so we can better understand the fraction.
So on TV or in our stories, will will say Baltimore had 18 violent crimes per 1,000 people, the top in the state. But small towns like Elkton were a problem too, with 13 violent crimes per 1,000 people.
We have adjusted the crimes to the population to get more context about crime among various communities.
To me, the news would be the small towns with above average crime.
>
>
The Google sheet, the FBI18 tab, has been cleaned and modified so it will play well with Tableau. Basically, the FBI spreadsheet was cleaned to remove the first four rows so Tableau will recognize the column headings (“Locality, Population, Violent Crime, etc”).
Here is a 10-minute video on how to do basic data cleaning, using another data set, but the principles apply. You can watch it later and amaze your friends:
And here is the end result in a Google Sheet. Add this sheet to your Google Drive or download it as a spreadsheet on your computer.
For this part, we will use Tableau Public to produce charts and a map. Download the app if you haven’t already
You will need to create a Tableau Public account.
Tableau accepts many data sources. Under “To a Server” there’s an option for Google Sheets. It will ask for access to your Google account and say ok.
Option #2: Download the FBI data to your computer as an Excel file, and then load it into Tableau.
With the Google Sheets link, select FBI18** and drag it to the pane that says “Drag sheets here.” **
Clean the data with the “Data Interpreter.” Click the box is right above the sheet listing.
Now, look at the data.
“Locality” should be the only things in text, designated in a blue “ABC.” That’s a dimension. The rest should be with a green # sign. Those are measures. The difference between dimensions (ABC) and measures (#) is crucial in Tableau.
We’re ready. Click on “Sheet 1” in lower lefthand corner.
Welcome to a Tableau Worksheet
- Drag the “Locality”" dimension to Rows
- Drag the “Violent Crime” measure to Columns
- Click Sort Descending Button in Toolbar (below “Dashboard” on menu)
* This is your basic graph of violent crime in Maryland in 2018.
Pro Tip: Duplicate and Iterate. Duplicate the Sheet 1: Left clicking on Sheet 1 tab. Select duplicate. More on Duplicate and Iterate later.
Filter to Maryland cities
+ To filter, drag Population (in Measures, left column) to Filters
+ Select All Values, Minimum Value at 22,578
+ The 22,578 will filter out the 20 largest Maryland cities.
* Create a Filter Slider + In The Filter Box, Select Population, click on down arrow on right + Click “Show Filter” + Filter with Slider appears in Right. + Adjust filter as you wish
- Crime per 1,000 residents
- Click Sheet 2
- Click on “Violent Crime” - select down arrow, see menu, select Create | Calculated Field
- You will create this calculation: [Violent Crime] / [Population] *1000
- Dialog Box Displays [Violent Crime]
- Add a division operator: [Violent Crime] /
- Drag Population to complete the equation: [Violent Crime] / [Population]
- Type *1000
- Rename it Violent Per 1000, click OK
- Drag the “Locality”" dimension to Rows
- Drag the “Violent Per 1000” measure to Columns
- Click Sort Descending Button in Toolbar (below “Dashboard” on menu)
- Drag Violent Per 1000 to Color in the Marks Bar
- Click Color, Edit Colors, Select Palette down arrow, select red-green diverging
- Select Reversed, Click OK
- Drag Violent Per 1000 to Label
- Make a headline
+ Double Click on the heading, "Sheet 2" and type in a headline.
+ Estill Has Highest Per Capita Crime Rate In SC
+ Violent Crime Per 1,000. Source: FBI Uniform Crime Report Data, 2018.
+ https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/table-8/table-8-state-cuts/south-carolina.xls
+ Use the formatting tools to adjust fonts and center.
Finished chart should look like this:
+ Format the numbers and legend. Look at the Marks card. Find the green pill representing the label "Sum(Violent Per 1000)." Click the down arrow in this label. Click Format.
+ If you had a percentage: The Format pane appears to the left. Under "Default," select "Numbers." Then select "Number - Custom." Select two decimal places and put a % sign in the suffix.
+ Format the Y axis legend. Left click on the Y axis to see menu. Select "Format." See the formatting pane on left: select "Axis." On "Scale," select "Numbers." Then select "Number - Custom." Select no decimals and put a % sign in the suffix.
+ Drag right border out to expand the chart. Mouse over right edge to find the graphic.
+ Rename the tab "Sheet 1" to "Arkansas Crime"
We’ll use the same data for a map.
First, we have to apply a geographic role to the Dimension “Locality.” This associates the city with geocoding so it can be applied to a map.
Notice in Measures that Latitude and Longitude measures are now created.
- Click on “new worksheet” tab at lower left. It creates a new Sheet.
- Drag Longitude to Columns, Latitude to Rows
- Drag “Locality” to the sheet. An Maryland map appears with blue dots.
- Note lower right hand corner: “84 unknown.” To fix this, we need to tell Tableau these cities are in Maryland
- In top menu, select Map | Edit Locations
- In state, select Fixed | Maryland
- FYI: Three others remain: Springdale, St. Stephen, Forest Acres. Long story on fixing those - involved editing latitude and longitude coordinates: Springdale: 33.960952, -81.109895
- Drag Violent per 1000 onto the map. Different sized dots appear
- Drag Violent per 1000 onto Color. Edit colors to Red-Green Diverging. Reverse the scale
- Click on size in Marks, adjust slider to enlarge the circles
- Double Click on the heading, “Sheet 2” and type in a headline.
- I typed "Maryland Crime, 2018
- Source: FBI Uniform Crime Report Data, 2018.
- https://ucr.fbi.gov/crime-in-the-u.s/2018/crime-in-the-u.s.-2018/tables/table-8/table-8-state-cuts/south-carolina.xls
- Use the formatting tools to adjust fonts and center.
- Remove Right Legend: Down arrow |Hide Card
- Drag color legend to x-axis, drag up top border to resize
Finished Product!
For example:
For example:
Story Function
The story function allows you to compile all of these separate sheet into one visual, interactive narrative. Click book icon in lower right to create a story. Add your various sheets to a story, creating one interactive graphic with multiple panels.
Duplicate and Iterate.
I built these multiple views by first formatting the first Arkansas sheet. Duplicate by left clicking on the sheet tab. Once copied, adjust the filter for your state or city. You can zero in on specific items and build specific data visualizations for industries or cities.
Get the embed code from Tableau Public and add it to your blog or website.
Now, go experiment and figure out some other visualization!
Thank you.
###Follow up questions:
–30–